Arabic Gloss WSD Using BERT

Abstract

Word Sense Disambiguation (WSD) aims to predict the correct sense of a word given its context. This problem is of extreme importance in Arabic, as written words can be highly ambiguous; 43% of diacritized words have multiple interpretations, and the percentage increases to 72% for non-diacritized words. Nevertheless, most Arabic text does not have diacritical marks. Gloss-based WSD methods measure the semantic similarity or overlap between the context of a target word that needs to be disambiguated and the dictionary definition of that word (the gloss of the word). Arabic gloss-based WSD suffers from a lack of context-gloss datasets. In this paper, we present an Arabic gloss-based WSD technique. We utilize the celebrated Bidirectional Encoder Representation from Transformers (BERT) to build two models that efficiently perform Arabic WSD. These models can be trained with few training samples since they utilize BERT models that were pretrained on a large Arabic corpus. Our experimental results show that our models outperform recent gloss-based WSDs when we test them against the same test data used to evaluate our model. Additionally, our model achieves an F1-score of 89% compared to the best-reported F1-score of 85% for knowledge-based Arabic WSD. Another contribution of this paper is introducing a context-gloss benchmark that may help overcome the lack of a standardized benchmark for Arabic gloss-based WSD.
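To make the gloss-based idea in the abstract concrete, the following is a minimal sketch of the overlap variant: each candidate sense is scored by how many words its gloss shares with the context of the target word. This is not the paper's BERT-based approach; it is a simple Lesk-style baseline, and the function names, the stopword list, the sense identifiers, and the English toy glosses are all assumptions made purely for illustration.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def overlap_score(context, gloss, stopwords=frozenset()):
    """Count the distinct non-stopword tokens shared by context and gloss."""
    context_words = set(tokenize(context)) - stopwords
    gloss_words = set(tokenize(gloss)) - stopwords
    return len(context_words & gloss_words)

def disambiguate(context, glosses, stopwords=frozenset()):
    """Return the sense id whose gloss overlaps most with the context."""
    return max(
        glosses,
        key=lambda sense: overlap_score(context, glosses[sense], stopwords),
    )

# Hypothetical stopword list and toy sense inventory (not from the paper).
STOPWORDS = frozenset(
    {"a", "an", "the", "of", "in", "on", "at", "to", "and", "or", "that", "he", "she"}
)
GLOSSES = {
    "bank.financial": "an institution that accepts deposits of money and makes loans",
    "bank.river": "the sloping land at the edge of a river or lake",
}

river_context = "He sat on the bank of the river and watched the water"
money_context = "She opened an account to deposit money at the bank"

river_sense = disambiguate(river_context, GLOSSES, STOPWORDS)
money_sense = disambiguate(money_context, GLOSSES, STOPWORDS)
print(river_sense, money_sense)
```

Exact word overlap is brittle: "deposit" in the context does not match "deposits" in the gloss. That brittleness is one motivation for replacing overlap counts with a learned measure of semantic similarity between context and gloss, which is the direction the abstract describes.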

Similar resources

Assessing Gloss of Tooth using Digital Imaging

The aim of this study was to assess the gloss of teeth by digital photography. A gonio-imaging system (gonio being Greek for angle) was developed to measure the gloss of human teeth in a laboratory setting. Polarised and non-polarised images were acquired around the specular angle. The gloss component was extracted and normalised to a theoretical standard; a BRDF curve was built to describe the gl...

Sussx: WSD using Automatically Acquired Predominant Senses

We introduced a method for automatically discovering the predominant sense of words using raw (unlabelled) text in (McCarthy et al., 2004) and participated with this system in SENSEVAL3. Since then, we have worked on further developing ideas to improve upon the base method. In the current paper we target two areas where we believe there is potential for improvement. In the first one we address the f...

Using Semantic Classification Trees for WSD

This paper describes the evaluation of a WSD method within SENSEVAL. This method is based on Semantic Classification Trees (SCTs) and short context dependencies between nouns and verbs. The training procedure creates a binary tree for each word to be disambiguated. SCTs are easy to implement and yield some promising results. The integration of linguistic knowledge could lead to substantial impr...

Neighbors Help: Bilingual Unsupervised WSD Using Context

Word Sense Disambiguation (WSD) is one of the toughest problems in NLP, and within WSD, verb disambiguation has proved to be extremely difficult because of a high degree of polysemy, overly fine-grained senses, the absence of a deep verb hierarchy, and low inter-annotator agreement in verb sense annotation. Unsupervised WSD has received widespread attention but has performed poorly, especially on verbs. Recen...

OE: WSD Using Optimal Ensembling (OE) Method

Optimal ensembling (OE) is a word sense disambiguation (WSD) method that uses word-specific training factors (average positive vs. negative training examples per sense, posex and negex) to predict the best system (classifier algorithm / applicable feature set) for a given target word. Our official entry (OE1) in Senseval-4 Task 17 (coarse-grained English lexical sample task) contained many design flaws and thus fa...

Journal

Journal title: Applied Sciences

Year: 2021

ISSN: 2076-3417

DOI: https://doi.org/10.3390/app11062567